45 research outputs found

    Use of multiple singular value decompositions to analyze complex intracellular calcium ion signals

    We compare calcium ion signaling ($\mathrm{Ca}^{2+}$) between two exposures; the data are present as movies, or, more prosaically, time series of images. This paper describes novel uses of singular value decompositions (SVD) and weighted versions of them (WSVD) to extract the signals from such movies, in a way that is semi-automatic and tuned closely to the actual data and their many complexities. These complexities include the following. First, the images themselves are of no interest: all interest focuses on the behavior of individual cells across time, and thus, the cells need to be segmented in an automated manner. Second, the cells themselves have 100+ pixels, so that they form 100+ curves measured over time, so that data compression is required to extract the features of these curves. Third, some of the pixels in some of the cells are subject to image saturation due to bit depth limits, and this saturation needs to be accounted for if one is to normalize the images in a reasonably unbiased manner. Finally, the $\mathrm{Ca}^{2+}$ signals have oscillations or waves that vary with time and these signals need to be extracted. Thus, our aim is to show how to use multiple weighted and standard singular value decompositions to detect, extract and clarify the $\mathrm{Ca}^{2+}$ signals. Our signal extraction methods then lead to simple although finely focused statistical methods to compare $\mathrm{Ca}^{2+}$ signals across experimental conditions. Comment: Published at http://dx.doi.org/10.1214/09-AOAS253 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
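As a minimal sketch of the compression step this abstract describes (not the authors' weighted SVD procedure), assume one segmented cell is stored as a pixels-by-time matrix. A plain SVD then summarizes the 100+ pixel curves by one dominant time course. The synthetic data, the array names, and the choice to keep only the first component are assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in for one segmented cell: 120 pixel curves over 200 frames.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 200)
signal = 1.0 + 0.5 * np.sin(2.0 * np.pi * 0.4 * t)   # shared oscillation (assumed)
gain = rng.uniform(0.5, 1.5, size=120)                # per-pixel brightness (assumed)
X = np.outer(gain, signal) + 0.05 * rng.standard_normal((120, 200))

# Thin SVD: X = U @ diag(s) @ Vt, singular values sorted in decreasing order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# The first right singular vector is the dominant time course shared by the pixels.
time_course = Vt[0]
if time_course.sum() < 0:        # the sign of a singular vector is arbitrary
    time_course = -time_course

# Fraction of the total sum of squares captured by the first component.
explained = s[0] ** 2 / np.sum(s ** 2)
# Agreement between the recovered time course and the simulated signal.
corr = np.corrcoef(time_course, signal)[0, 1]
```

Saturated pixels would need down-weighting before the decomposition, which is where a weighted SVD would replace the plain one used here.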

    A Study of Mexican Free-Tailed Bat Chirp Syllables: Bayesian Functional Mixed Models for Nonstationary Acoustic Time Series

    We describe a new approach to analyze chirp syllables of free-tailed bats from two regions of Texas in which they are predominant: Austin and College Station. Our goal is to characterize any systematic regional differences in the mating chirps and assess whether individual bats have signature chirps. The data are analyzed by modeling spectrograms of the chirps as responses in a Bayesian functional mixed model. Given the variable chirp lengths, we compute the spectrograms on a relative time scale interpretable as the relative chirp position, using a variable window overlap based on chirp length. We use 2D wavelet transforms to capture correlation within the spectrogram in our modeling and obtain adaptive regularization of the estimates and inference for the region-specific spectrograms. Our model includes random effect spectrograms at the bat level to account for correlation among chirps from the same bat, and to assess relative variability in chirp spectrograms within and between bats. The modeling of spectrograms using functional mixed models is a general approach for the analysis of replicated nonstationary time series, such as our acoustical signals, to relate aspects of the signals to various predictors, while accounting for between-signal structure. This can be done on raw spectrograms when all signals are of the same length, and can be done using spectrograms defined on a relative time scale for signals of variable length in settings where the idea of defining correspondence across signals based on relative position is sensible.

    Association of Researcher Characteristics with Views on Return of Incidental Findings from Genomic Research

    Whole exome/genome sequencing (WES/WGS) is now commonly used in research and is increasingly used in clinical care to identify the genetic basis of rare and unknown diseases. The management of incidental findings (IFs) generated through these analyses is debated within the research community. To examine how views regarding genomic research IFs are associated with researcher characteristics and experiences, we surveyed genetic professionals and assessed the effect of professional background and experience on their opinions. Researchers who did not have clinical training, provide clinical care to research participants, or have prior experience returning research results were in general more inclined to offer return of IFs than their colleagues with these characteristics. Understanding this will be important to fully appreciate the impact that policies on return of genetic IFs could have on participants, researchers, and genomic research.

    Clinical Sequencing Exploratory Research Consortium: Accelerating Evidence-Based Practice of Genomic Medicine

    Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publicly available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
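To make one of the named baselines concrete, the following is a rough sketch of Term Frequency-Inverse Document Frequency ranking with cosine similarity. The abstract does not give the weighting or tokenization the consortium used, so the raw-count tf, the log(N/df) idf, the whitespace tokenizer, the function names, and the toy texts are all assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using raw tf times log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({term: count * math.log(n_docs / df[term])
                        for term, count in tf.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy "abstracts" invented for illustration; these are not RELISH data.
docs = [
    "calcium signaling in cells measured by imaging",   # seed article
    "imaging of calcium waves in single cells",
    "penalized spline regression with mixed models",
]
vecs = tfidf_vectors(docs)

# Rank the candidate documents by similarity to the seed (index 0).
ranked = sorted(range(1, len(docs)),
                key=lambda i: cosine(vecs[0], vecs[i]), reverse=True)
```

Here the second text shares several terms with the seed and ranks first, while the third shares none and gets a similarity of zero.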

    PSQR: A Stable and Efficient Penalized Spline Algorithm

    We introduce an algorithm for reliably computing quantities associated with several types of semiparametric mixed models in situations where the condition number of the random effects matrix is large. The algorithm is numerically stable and efficient. It was designed to process penalized spline (P-spline) models without making unnecessary numerical approximations. The algorithm, PSQR (P-splines via QR), is formulated in terms of QR decompositions. PSQR can treat both exactly rank deficient and ill-conditioned matrices. The latter situation often arises in large scale mixed models and/or when a P-spline is estimated using a basis with poor numerical properties, e.g. a truncated power function (TPF) basis. We provide concrete examples where unnecessary numerical approximations introduce both subtle and dramatic errors that would likely go undetected, thus demonstrating the importance of using this reliable numerical algorithm. Simulation results studying a univariate function and a longitudinal data set are used to demonstrate the algorithm. Extensions and the utility of the method in more general semiparametric regression applications are briefly discussed. MATLAB scripts demonstrating implementation are provided in the Supplemental Materials.
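The general idea of fitting a penalized spline through a QR decomposition, rather than through the normal equations, can be sketched as follows. This is not the PSQR algorithm itself, which also handles rank deficiency and mixed-model quantities; it only shows the standard device of stacking the penalty under the design matrix. The degree-1 TPF basis, the knot placement, the smoothing parameter `lam`, and the simulated data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

# Truncated power function (TPF) basis of degree 1: [1, x, (x - k_1)_+, ..., (x - k_K)_+].
knots = np.linspace(0.0, 1.0, 22)[1:-1]                # 20 interior knots (assumed)
C = np.column_stack([np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots])

lam = 1e-3                                              # smoothing parameter (assumed)
# Penalize only the knot coefficients, not the intercept and slope.
D = np.diag(np.r_[0.0, 0.0, np.ones(knots.size)])

# Stack design and penalty: ||y - C b||^2 + lam * b' D b = ||y_aug - A b||^2.
A = np.vstack([C, np.sqrt(lam) * D])
y_aug = np.concatenate([y, np.zeros(C.shape[1])])

# Solve the least-squares problem through a reduced QR instead of forming C'C.
# R is upper triangular, so a dedicated triangular solver would also work here.
Q, R = np.linalg.qr(A)
beta_qr = np.linalg.solve(R, Q.T @ y_aug)

# Normal-equations solution for comparison; cond(C'C) is roughly cond(C)^2.
beta_ne = np.linalg.solve(C.T @ C + lam * D, C.T @ y)

fitted = C @ beta_qr
```

On this small, mildly conditioned problem the two solutions agree closely; the difference between them is expected to matter as the basis becomes more ill-conditioned.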

    P-Splines Using Derivative Information

    Time series associated with single-molecule experiments and/or simulations contain a wealth of multiscale information about complex biochemical systems. However, efficiently extracting and representing useful physical information from these time series measurements can be challenging. We demonstrate how penalized splines (P-splines) can be useful in summarizing complex single-molecule time series data using quantities estimated from the observed data. A design matrix that simultaneously uses noisy function and derivative scatterplot information to refine function estimates using P-spline techniques is introduced. The approach is called the PuDI (P-Splines using Derivative Information) method. We show how Generalized Least Squares fits seamlessly into the PuDI method; several applications demonstrating how inclusion of uncertainty information improves the PuDI function estimates are presented. The PuDI design matrix can be used to assist scatterplot smoothing applications where both unbiased function and derivative estimates are available.